Overview

Dataset statistics

Number of variables: 9
Number of observations: 8991
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 632.3 KiB
Average record size in memory: 72.0 B
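
The average record size is consistent with nine 8-byte numeric values per row. The sketch below redoes that arithmetic. The 8-byte width (a float64 column) is an assumption, since the report does not state the column dtype.

```python
# Assumption: each of the 9 numeric columns is stored as an 8-byte float64.
N_VARIABLES = 9
N_OBSERVATIONS = 8991
BYTES_PER_VALUE = 8  # assumed, not stated in the report

record_bytes = N_VARIABLES * BYTES_PER_VALUE
data_kib = record_bytes * N_OBSERVATIONS / 1024  # column data only

print(f"bytes per record: {record_bytes}")
print(f"column data: {data_kib:.1f} KiB")
```

Under that assumption the column data alone comes to slightly less than the reported total, which presumably also counts some per-table overhead such as the index.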

Variable types

Numeric: 9

Warnings

CO(GT) is highly correlated with PT08.S1(CO) and 7 other fields (High correlation)
PT08.S1(CO) is highly correlated with CO(GT) and 7 other fields (High correlation)
C6H6(GT) is highly correlated with CO(GT) and 7 other fields (High correlation)
PT08.S2(NMHC) is highly correlated with CO(GT) and 7 other fields (High correlation)
NOx(GT) is highly correlated with CO(GT) and 6 other fields (High correlation)
PT08.S3(NOx) is highly correlated with CO(GT) and 7 other fields (High correlation)
NO2(GT) is highly correlated with CO(GT) and 6 other fields (High correlation)
PT08.S4(NO2) is highly correlated with CO(GT) and 5 other fields (High correlation)
PT08.S5(O3) is highly correlated with CO(GT) and 7 other fields (High correlation)
CO(GT) is highly correlated with PT08.S1(CO) and 6 other fields (High correlation)
PT08.S1(CO) is highly correlated with CO(GT) and 4 other fields (High correlation)
C6H6(GT) is highly correlated with CO(GT) and 5 other fields (High correlation)
PT08.S2(NMHC) is highly correlated with CO(GT) and 5 other fields (High correlation)
NOx(GT) is highly correlated with CO(GT) and 3 other fields (High correlation)
PT08.S3(NOx) is highly correlated with CO(GT) and 5 other fields (High correlation)
NO2(GT) is highly correlated with CO(GT) and 1 other field (High correlation)
PT08.S4(NO2) is highly correlated with C6H6(GT) and 1 other field (High correlation)
PT08.S5(O3) is highly correlated with CO(GT) and 5 other fields (High correlation)
C6H6(GT) is highly correlated with PT08.S4(NO2) and 7 other fields (High correlation)
PT08.S4(NO2) is highly correlated with C6H6(GT) and 5 other fields (High correlation)
CO(GT) is highly correlated with C6H6(GT) and 7 other fields (High correlation)
NOx(GT) is highly correlated with C6H6(GT) and 6 other fields (High correlation)
PT08.S2(NMHC) is highly correlated with C6H6(GT) and 7 other fields (High correlation)
PT08.S1(CO) is highly correlated with C6H6(GT) and 7 other fields (High correlation)
PT08.S5(O3) is highly correlated with C6H6(GT) and 7 other fields (High correlation)
NO2(GT) is highly correlated with C6H6(GT) and 6 other fields (High correlation)
PT08.S3(NOx) is highly correlated with C6H6(GT) and 7 other fields (High correlation)
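
The warnings above flag variables whose pairwise correlation is high. A minimal sketch of that kind of check follows, assuming a symmetric correlation matrix held as nested dicts. The function name, the toy matrix, and the 0.9 cutoff are illustrative assumptions, not values taken from this report or from pandas-profiling.

```python
from itertools import combinations


def high_correlation_pairs(corr, threshold):
    """Return (a, b, r) for every pair whose |r| is at or above threshold."""
    names = sorted(corr)
    return [
        (a, b, corr[a][b])
        for a, b in combinations(names, 2)
        if abs(corr[a][b]) >= threshold
    ]


# Toy symmetric matrix; the numbers are made up for illustration.
toy_corr = {
    "a": {"a": 1.0, "b": 0.95, "c": -0.2},
    "b": {"a": 0.95, "b": 1.0, "c": 0.1},
    "c": {"a": -0.2, "b": 0.1, "c": 1.0},
}

flagged = high_correlation_pairs(toy_corr, threshold=0.9)
print(flagged)
```

Using the absolute value means strong negative correlations are flagged as well as strong positive ones.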

Reproduction

Analysis started: 2021-07-01 14:50:42.201667
Analysis finished: 2021-07-01 14:50:58.141549
Duration: 15.94 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

CO(GT)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 94
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.069313758
Minimum: 0.1
Maximum: 11.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 70.4 KiB

Quantile statistics

Minimum: 0.1
5-th percentile: 0.5
Q1: 1.2
Median: 1.8
Q3: 2.6
95-th percentile: 4.6
Maximum: 11.9
Range: 11.8
Interquartile range (IQR): 1.4
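
The range and IQR follow directly from the listed quantiles (11.9 - 0.1 = 11.8 and 2.6 - 1.2 = 1.4). The sketch below shows one common way to compute such quantiles, using linear interpolation between order statistics. That interpolation rule and the helper name `percentile` are assumptions, since the report does not say how its percentiles were computed.

```python
def percentile(values, q):
    """q-th percentile (0-100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100   # fractional index into the sorted data
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


toy = [1, 2, 3, 4, 5, 6, 7, 8, 9]   # illustrative data, not from the report
q1 = percentile(toy, 25)
q3 = percentile(toy, 75)
iqr = q3 - q1                        # interquartile range
value_range = max(toy) - min(toy)    # range

print(q1, q3, iqr, value_range)
```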

Descriptive statistics

Standard deviation: 1.304487151
Coefficient of variation (CV): 0.6303960167
Kurtosis: 4.082047789
Mean: 2.069313758
Median Absolute Deviation (MAD): 0.6
Skewness: 1.616442422
Sum: 18605.2
Variance: 1.701686726
Monotonicity: Not monotonic
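
Several of these statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The short check below recomputes them from the values reported for CO(GT); it is a consistency sketch, not the code that produced the report.

```python
# Values copied from the CO(GT) statistics in this report.
n = 8991
mean = 2.069313758
std = 1.304487151

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # sum of all observations

print(round(cv, 6), round(variance, 6), round(total, 1))
```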
Histogram with fixed size bins (bins=50)
Most frequent values
Value                Count  Frequency (%)
1.8                  1822   20.3%
1                    287    3.2%
1.4                  269    3.0%
1.5                  265    2.9%
1.6                  264    2.9%
0.7                  252    2.8%
1.1                  251    2.8%
1.3                  248    2.8%
0.8                  243    2.7%
0.9                  241    2.7%
Other values (84)    4849   53.9%

Smallest values
Value                Count  Frequency (%)
0.1                  33     0.4%
0.2                  45     0.5%
0.3                  97     1.1%
0.4                  160    1.8%
0.5                  217    2.4%
0.6                  238    2.6%
0.7                  252    2.8%
0.8                  243    2.7%
0.9                  241    2.7%
1                    287    3.2%

Largest values
Value                Count  Frequency (%)
11.9                 1      < 0.1%
11.5                 1      < 0.1%
10.2                 2      < 0.1%
10.1                 1      < 0.1%
9.9                  1      < 0.1%
9.5                  1      < 0.1%
9.4                  1      < 0.1%
9.2                  1      < 0.1%
9.1                  1      < 0.1%
8.7                  3      < 0.1%
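
Each frequency is the count divided by the number of observations; for example, the value 1.8 occurs 1822 times out of 8991, which is about 20.3%. A minimal sketch of how such a value/count/frequency listing can be built is shown below. The helper name `frequency_table` and the toy data are illustrative assumptions.

```python
from collections import Counter


def frequency_table(values, top=10):
    """Rows of (value, count, percent) for the `top` most common values,
    followed by one 'Other values' row for everything else."""
    counts = Counter(values)
    n = len(values)
    common = counts.most_common(top)
    rows = [(value, count, 100 * count / n) for value, count in common]
    other_distinct = len(counts) - len(common)
    if other_distinct > 0:
        other_count = n - sum(count for _, count in common)
        rows.append((f"Other values ({other_distinct})", other_count, 100 * other_count / n))
    return rows


toy = [1.8] * 5 + [1.0] * 3 + [0.7] * 2 + [2.5, 3.1]   # illustrative data
rows = frequency_table(toy, top=2)
for value, count, pct in rows:
    print(value, count, f"{pct:.1f}%")
```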

PT08.S1(CO)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1041
Distinct (%): 11.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1099.833166
Minimum: 647
Maximum: 2040
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 70.4 KiB

Quantile statistics

Minimum: 647
5-th percentile: 810.5
Q1: 937
Median: 1063
Q3: 1231
95-th percentile: 1508
Maximum: 2040
Range: 1393
Interquartile range (IQR): 294

Descriptive statistics

Standard deviation: 217.0800373
Coefficient of variation (CV): 0.1973754237
Kurtosis: 0.3351286502
Mean: 1099.833166
Median Absolute Deviation (MAD): 142
Skewness: 0.7559073724
Sum: 9888600
Variance: 47123.74258
Monotonicity: Not monotonic
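
The median absolute deviation is a robust spread measure: the median of the absolute distances from the median. The sketch below computes the unscaled form. Whether the report applies any scaling constant is not stated, so treat the unscaled definition and the helper name as assumptions.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median (unscaled)."""
    values = list(values)
    center = median(values)
    return median(abs(v - center) for v in values)


toy = [1, 2, 3, 4, 100]   # illustrative data with one outlier
mad = median_absolute_deviation(toy)
print(mad)
```

Unlike the standard deviation, the result is barely affected by the single outlier in the toy data.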
Histogram with fixed size bins (bins=50)
Most frequent values
Value                Count  Frequency (%)
973                  30     0.3%
1100                 28     0.3%
925                  26     0.3%
969                  26     0.3%
938                  26     0.3%
988                  26     0.3%
970                  25     0.3%
1053                 25     0.3%
966                  25     0.3%
987                  25     0.3%
Other values (1031)  8729   97.1%

Smallest values
Value                Count  Frequency (%)
647                  1      < 0.1%
649                  1      < 0.1%
655                  1      < 0.1%
667                  3      < 0.1%
669                  1      < 0.1%
676                  1      < 0.1%
678                  1      < 0.1%
679                  1      < 0.1%
681                  1      < 0.1%
683                  2      < 0.1%

Largest values
Value                Count  Frequency (%)
2040                 1      < 0.1%
2008                 1      < 0.1%
1982                 1      < 0.1%
1975                 1      < 0.1%
1973                 1      < 0.1%
1961                 1      < 0.1%
1956                 1      < 0.1%
1934                 1      < 0.1%
1918                 1      < 0.1%
1917                 1      < 0.1%

C6H6(GT)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 407
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.08310533
Minimum: 0.1
Maximum: 63.7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 70.4 KiB

Quantile statistics

Minimum: 0.1
5-th percentile: 1.7
Q1: 4.4
Median: 8.2
Q3: 14
95-th percentile: 24.65
Maximum: 63.7
Range: 63.6
Interquartile range (IQR): 9.6

Descriptive statistics

Standard deviation: 7.449819698
Coefficient of variation (CV): 0.7388418008
Kurtosis: 2.488705886
Mean: 10.08310533
Median Absolute Deviation (MAD): 4.4
Skewness: 1.36153227
Sum: 90657.2
Variance: 55.49981354
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value                Count  Frequency (%)
3.6                  84     0.9%
2.8                  82     0.9%
3.8                  79     0.9%
4                    78     0.9%
3.1                  77     0.9%
3                    76     0.8%
2.5                  75     0.8%
2.9                  73     0.8%
5.4                  72     0.8%
6                    71     0.8%
Other values (397)   8224   91.5%

Smallest values
Value                Count  Frequency (%)
0.1                  2      < 0.1%
0.2                  8      0.1%
0.3                  10     0.1%
0.4                  14     0.2%
0.5                  20     0.2%
0.6                  23     0.3%
0.7                  31     0.3%
0.8                  25     0.3%
0.9                  25     0.3%
1                    30     0.3%

Largest values
Value                Count  Frequency (%)
63.7                 1      < 0.1%
52.1                 1      < 0.1%
50.8                 1      < 0.1%
50.7                 1      < 0.1%
50.6                 1      < 0.1%
49.5                 1      < 0.1%
49.4                 1      < 0.1%
48.2                 1      < 0.1%
47.7                 1      < 0.1%
47.5                 1      < 0.1%

PT08.S2(NMHC)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1245
Distinct (%): 13.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 939.1533756
Minimum: 383
Maximum: 2214
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 70.4 KiB

Quantile statistics

Minimum: 383
5-th percentile: 562
Q1: 734.5
Median: 909
Q3: 1116
95-th percentile: 1420
Maximum: 2214
Range: 1831
Interquartile range (IQR): 381.5

Descriptive statistics

Standard deviation: 266.8314286
Coefficient of variation (CV): 0.2841191179
Kurtosis: 0.06324387318
Mean: 939.1533756
Median Absolute Deviation (MAD): 188
Skewness: 0.56156598
Sum: 8443928
Variance: 71199.01129
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value                Count  Frequency (%)
853                  25     0.3%
859                  23     0.3%
880                  23     0.3%
800                  23     0.3%
985                  22     0.2%
769                  21     0.2%
850                  21     0.2%
776                  21     0.2%
783                  21     0.2%
1012                 20     0.2%
Other values (1235)  8771   97.6%

Smallest values
Value                Count  Frequency (%)
383                  2      < 0.1%
387                  1      < 0.1%
388                  1      < 0.1%
390                  2      < 0.1%
397                  1      < 0.1%
399                  1      < 0.1%
402                  2      < 0.1%
407                  2      < 0.1%
408                  1      < 0.1%
409                  1      < 0.1%

Largest values
Value                Count  Frequency (%)
2214                 1      < 0.1%
2007                 1      < 0.1%
1983                 1      < 0.1%
1981                 1      < 0.1%
1980                 1      < 0.1%
1959                 1      < 0.1%
1958                 1      < 0.1%
1935                 1      < 0.1%
1924                 1      < 0.1%
1920                 1      < 0.1%

NOx(GT)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 898
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 230.8021355
Minimum: 2
Maximum: 1479
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 70.4 KiB

Quantile statistics

Minimum: 2
5-th percentile: 41
Q1: 112
Median: 178
Q3: 280
95-th percentile: 635
Maximum: 1479
Range: 1477
Interquartile range (IQR): 168

Descriptive statistics

Standard deviation: 188.7172102
Coefficient of variation (CV): 0.8176579901
Kurtosis: 4.952450426
Mean: 230.8021355
Median Absolute Deviation (MAD): 76
Skewness: 1.995177294
Sum: 2075142
Variance: 35614.18543
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value                Count  Frequency (%)
178                  1617   18.0%
89                   39     0.4%
93                   36     0.4%
65                   35     0.4%
180                  35     0.4%
132                  35     0.4%
122                  35     0.4%
95                   34     0.4%
41                   34     0.4%
51                   33     0.4%
Other values (888)   7058   78.5%

Smallest values
Value                Count  Frequency (%)
2                    1      < 0.1%
4                    1      < 0.1%
6                    1      < 0.1%
7                    1      < 0.1%
8                    1      < 0.1%
9                    1      < 0.1%
10                   3      < 0.1%
11                   4      < 0.1%
12                   4      < 0.1%
13                   4      < 0.1%

Largest values
Value                Count  Frequency (%)
1479                 1      < 0.1%
1389                 2      < 0.1%
1369                 1      < 0.1%
1358                 1      < 0.1%
1345                 1      < 0.1%
1301                 1      < 0.1%
1290                 1      < 0.1%
1247                 1      < 0.1%
1230                 1      < 0.1%
1220                 1      < 0.1%

PT08.S3(NOx)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1221
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 835.4936047
Minimum: 322
Maximum: 2683
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 70.4 KiB

Quantile statistics

Minimum: 322
5-th percentile: 483
Q1: 658
Median: 806
Q3: 969.5
95-th percentile: 1291
Maximum: 2683
Range: 2361
Interquartile range (IQR): 311.5

Descriptive statistics

Standard deviation: 256.81732
Coefficient of variation (CV): 0.3073839447
Kurtosis: 2.677558895
Mean: 835.4936047
Median Absolute Deviation (MAD): 155
Skewness: 1.101729235
Sum: 7511923
Variance: 65955.13586
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value                Count  Frequency (%)
846                  25     0.3%
767                  25     0.3%
733                  25     0.3%
876                  23     0.3%
765                  23     0.3%
685                  22     0.2%
830                  22     0.2%
872                  22     0.2%
816                  22     0.2%
720                  22     0.2%
Other values (1211)  8760   97.4%

Smallest values
Value                Count  Frequency (%)
322                  1      < 0.1%
325                  2      < 0.1%
328                  1      < 0.1%
330                  2      < 0.1%
334                  1      < 0.1%
335                  1      < 0.1%
340                  2      < 0.1%
341                  1      < 0.1%
345                  1      < 0.1%
346                  1      < 0.1%

Largest values
Value                Count  Frequency (%)
2683                 1      < 0.1%
2559                 1      < 0.1%
2542                 1      < 0.1%
2331                 1      < 0.1%
2327                 1      < 0.1%
2318                 1      < 0.1%
2294                 1      < 0.1%
2121                 1      < 0.1%
2095                 2      < 0.1%
2081                 1      < 0.1%

NO2(GT)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 274
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 111.5861417
Minimum: 2
Maximum: 333
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 70.4 KiB

Quantile statistics

Minimum: 2
5-th percentile: 45
Q1: 85
Median: 109
Q3: 132
95-th percentile: 193
Maximum: 333
Range: 331
Interquartile range (IQR): 47

Descriptive statistics

Standard deviation: 43.2058078
Coefficient of variation (CV): 0.3871968969
Kurtosis: 1.07474443
Mean: 111.5861417
Median Absolute Deviation (MAD): 23
Skewness: 0.6708264814
Sum: 1003271
Variance: 1866.741828
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value                Count  Frequency (%)
109                  1659   18.5%
97                   76     0.8%
119                  74     0.8%
114                  74     0.8%
101                  74     0.8%
110                  73     0.8%
117                  73     0.8%
95                   73     0.8%
115                  70     0.8%
116                  69     0.8%
Other values (264)   6676   74.3%

Smallest values
Value                Count  Frequency (%)
2                    1      < 0.1%
3                    1      < 0.1%
5                    2      < 0.1%
7                    1      < 0.1%
8                    2      < 0.1%
9                    2      < 0.1%
11                   2      < 0.1%
12                   2      < 0.1%
13                   1      < 0.1%
14                   5      0.1%

Largest values
Value                Count  Frequency (%)
333                  1      < 0.1%
322                  1      < 0.1%
310                  1      < 0.1%
309                  1      < 0.1%
306                  1      < 0.1%
301                  1      < 0.1%
295                  1      < 0.1%
288                  2      < 0.1%
285                  1      < 0.1%
283                  3      < 0.1%

PT08.S4(NO2)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1603
Distinct (%): 17.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1456.264598
Minimum: 551
Maximum: 2775
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 70.4 KiB

Quantile statistics

Minimum: 551
5-th percentile: 883
Q1: 1227
Median: 1463
Q3: 1674
95-th percentile: 2029
Maximum: 2775
Range: 2224
Interquartile range (IQR): 447

Descriptive statistics

Standard deviation: 346.2067935
Coefficient of variation (CV): 0.2377361875
Kurtosis: 0.07801862433
Mean: 1456.264598
Median Absolute Deviation (MAD): 221
Skewness: 0.2053885254
Sum: 13093275
Variance: 119859.1439
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value                Count  Frequency (%)
1488                 24     0.3%
1580                 22     0.2%
1539                 21     0.2%
1467                 20     0.2%
1638                 19     0.2%
1490                 18     0.2%
1418                 18     0.2%
1321                 17     0.2%
1511                 17     0.2%
1435                 17     0.2%
Other values (1593)  8798   97.9%

Smallest values
Value                Count  Frequency (%)
551                  1      < 0.1%
559                  1      < 0.1%
561                  1      < 0.1%
579                  1      < 0.1%
601                  1      < 0.1%
602                  1      < 0.1%
605                  1      < 0.1%
621                  1      < 0.1%
637                  1      < 0.1%
640                  1      < 0.1%

Largest values
Value                Count  Frequency (%)
2775                 1      < 0.1%
2746                 1      < 0.1%
2691                 1      < 0.1%
2684                 1      < 0.1%
2679                 1      < 0.1%
2667                 1      < 0.1%
2665                 1      < 0.1%
2662                 1      < 0.1%
2643                 2      < 0.1%
2641                 2      < 0.1%

PT08.S5(O3)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1743
Distinct (%): 19.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1022.906128
Minimum: 221
Maximum: 2523
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 70.4 KiB

Quantile statistics

Minimum: 221
5-th percentile: 461
Q1: 731.5
Median: 963
Q3: 1273.5
95-th percentile: 1761.5
Maximum: 2523
Range: 2302
Interquartile range (IQR): 542

Descriptive statistics

Standard deviation: 398.4842877
Coefficient of variation (CV): 0.3895609545
Kurtosis: 0.07861233923
Mean: 1022.906128
Median Absolute Deviation (MAD): 261
Skewness: 0.6278644976
Sum: 9196949
Variance: 158789.7276
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value                Count  Frequency (%)
836                  20     0.2%
825                  20     0.2%
826                  19     0.2%
926                  18     0.2%
799                  17     0.2%
777                  17     0.2%
891                  16     0.2%
905                  16     0.2%
949                  16     0.2%
923                  16     0.2%
Other values (1733)  8816   98.1%

Smallest values
Value                Count  Frequency (%)
221                  1      < 0.1%
225                  1      < 0.1%
227                  1      < 0.1%
232                  1      < 0.1%
252                  1      < 0.1%
253                  1      < 0.1%
257                  1      < 0.1%
261                  2      < 0.1%
262                  1      < 0.1%
263                  1      < 0.1%

Largest values
Value                Count  Frequency (%)
2523                 1      < 0.1%
2522                 1      < 0.1%
2519                 1      < 0.1%
2515                 1      < 0.1%
2494                 1      < 0.1%
2480                 1      < 0.1%
2475                 1      < 0.1%
2465                 1      < 0.1%
2452                 1      < 0.1%
2434                 1      < 0.1%

Interactions

[Figures: 81 pairwise interaction plots, one for each ordered pair of the 9 variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that, for a linear relationship, the angle of the line to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
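
The calculation described above can be sketched in a few lines of standard-library Python. The inputs are the first five CO(GT) and C6H6(GT) values from the sample rows of this report; the function name `pearson_r` is illustrative, and this is not the code pandas-profiling itself runs.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and both variances cancel, so sums suffice.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


co = [2.6, 2.0, 2.2, 2.2, 1.6]        # CO(GT), first five sample rows
c6h6 = [11.9, 9.4, 9.0, 9.2, 6.5]     # C6H6(GT), first five sample rows

r = pearson_r(co, c6h6)
# Shifting and positively rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(co, [10 * v + 3 for v in c6h6])
print(r, r_rescaled)
```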

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
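
A minimal sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. Giving tied values the average of their positions is a common convention and is assumed here; the helper names are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values starting at i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]   # nonlinear but strictly increasing
print(spearman_rho(x, y), pearson_r(x, y))
```

On the toy data the relationship is perfectly monotonic but not linear, so ρ reaches its maximum while r does not.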

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
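
The pair-counting definition translates directly into code. The sketch below implements exactly the formula stated above (often called tau-a); library implementations frequently use a tie-adjusted variant instead, so results can differ when the data contain ties. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1     # both variables move in the same direction
        elif sign < 0:
            discordant += 1     # the variables move in opposite directions
        # sign == 0 is a tie in x or y and counts toward neither total
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])   # 5 concordant, 1 discordant
print(tau)
```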

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
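
Nullity by column is simply a per-column count of missing cells; in this dataset every count is zero. A minimal sketch follows, assuming rows stored as dicts and treating both None and NaN as missing. The helper names and the toy rows are illustrative.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def missing_by_column(rows, columns):
    """Count missing cells per column in a list of dict rows."""
    return {c: sum(1 for row in rows if is_missing(row.get(c))) for c in columns}


toy_rows = [
    {"a": 1.0, "b": None},
    {"a": float("nan"), "b": 2.0},
    {"a": 3.0, "b": 4.0},
]
nullity = missing_by_column(toy_rows, ["a", "b"])
print(nullity)
```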

Sample

First rows

      CO(GT)         PT08.S1(CO)    C6H6(GT)       PT08.S2(NMHC)  NOx(GT)        PT08.S3(NOx)   NO2(GT)        PT08.S4(NO2)   PT08.S5(O3)
0     2.6            1360.0         11.9           1046.0         166.0          1056.0         113.0          1692.0         1268.0
1     2.0            1292.0         9.4            955.0          103.0          1174.0         92.0           1559.0         972.0
2     2.2            1402.0         9.0            939.0          131.0          1140.0         114.0          1555.0         1074.0
3     2.2            1376.0         9.2            948.0          172.0          1092.0         122.0          1584.0         1203.0
4     1.6            1272.0         6.5            836.0          131.0          1205.0         116.0          1490.0         1110.0
5     1.2            1197.0         4.7            750.0          89.0           1337.0         96.0           1393.0         949.0
6     1.2            1185.0         3.6            690.0          62.0           1462.0         77.0           1333.0         733.0
7     1.0            1136.0         3.3            672.0          62.0           1453.0         76.0           1333.0         730.0
8     0.9            1094.0         2.3            609.0          45.0           1579.0         60.0           1276.0         620.0
9     0.6            1010.0         1.7            561.0          178.0          1705.0         109.0          1235.0         501.0

Last rows

      CO(GT)         PT08.S1(CO)    C6H6(GT)       PT08.S2(NMHC)  NOx(GT)        PT08.S3(NOx)   NO2(GT)        PT08.S4(NO2)   PT08.S5(O3)
8981  0.5            888.0          1.3            528.0          77.0           1077.0         53.0           987.0          578.0
8982  1.1            1031.0         4.4            730.0          182.0          760.0          93.0           1129.0         905.0
8983  4.0            1384.0         17.4           1221.0         594.0          470.0          155.0          1600.0         1457.0
8984  5.0            1446.0         22.4           1362.0         586.0          415.0          174.0          1777.0         1705.0
8985  3.9            1297.0         13.6           1102.0         523.0          507.0          187.0          1375.0         1583.0
8986  3.1            1314.0         13.5           1101.0         472.0          539.0          190.0          1374.0         1729.0
8987  2.4            1163.0         11.4           1027.0         353.0          604.0          179.0          1264.0         1269.0
8988  2.4            1142.0         12.4           1063.0         293.0          603.0          175.0          1241.0         1092.0
8989  2.1            1003.0         9.5            961.0          235.0          702.0          156.0          1041.0         770.0
8990  2.2            1071.0         11.9           1047.0         265.0          654.0          168.0          1129.0         816.0